Minimax risks for sparse regressions: Ultra-high dimensional phenomenons

Author

  • Nicolas Verzelen
Abstract

Consider the standard Gaussian linear regression model Y = Xθ0 + ε, where Y ∈ R^n is a response vector and X ∈ R^{n×p} is a design matrix. Numerous works have been devoted to building efficient estimators of θ0 when p is much larger than n. In such a situation, a classical approach amounts to assuming that θ0 is approximately sparse. This paper studies the minimax risks of estimation and testing over classes of k-sparse vectors θ0. These bounds shed light on the limitations imposed by high dimensionality. The results encompass the problem of prediction (estimation of Xθ0), the inverse problem (estimation of θ0), and linear testing (testing Xθ0 = 0). Interestingly, an elbow effect occurs when k log(p/k) becomes large compared to n: the minimax risks and hypothesis separation distances blow up in this ultra-high dimensional setting. We also prove that even dimension-reduction techniques cannot provide satisfying results in an ultra-high dimensional setting. Moreover, we compute the minimax risks when the variance of the noise is unknown; the knowledge of this variance is shown to play a significant role in the optimal rates of estimation and testing. All these minimax bounds provide a characterization of statistical problems that are so difficult that no procedure can provide satisfying results.
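To make the elbow effect concrete, the display below is a rough sketch of the well-known minimax prediction rate over k-sparse vectors when the noise level σ is known; it is not a statement quoted from the paper, constants and the exact form of the logarithm are omitted, and the regime condition is stated only up to constants:

\[
\inf_{\widehat\theta}\ \sup_{\|\theta_0\|_0 \le k}\
\mathbb{E}\Big[\tfrac{1}{n}\,\|X(\widehat\theta - \theta_0)\|_2^2\Big]
\;\asymp\; \sigma^2\,\frac{k\log(p/k)}{n},
\qquad \text{for } k\log(p/k) \lesssim n .
\]

Once k log(p/k) becomes comparable to or larger than n (the ultra-high dimensional regime), this rate no longer applies and, as stated in the abstract, the minimax risks and separation distances blow up.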


Similar articles

Minimax Optimal Rates of Estimation in High Dimensional Additive Models: Universal Phase Transition

We establish minimax optimal rates of convergence for estimation in a high dimensional additive model assuming that it is approximately sparse. Our results reveal an interesting phase transition behavior universal to this class of high dimensional problems. In the sparse regime when the components are sufficiently smooth or the dimensionality is sufficiently large, the optimal rates are identic...


Rate-Optimal Posterior Contraction for Sparse PCA

Principal component analysis (PCA) is possibly one of the most widely used statistical tools to recover a low rank structure of the data. In the high-dimensional settings, the leading eigenvector of the sample covariance can be nearly orthogonal to the true eigenvector. A sparse structure is then commonly assumed along with a low rank structure. Recently, minimax estimation rates of sparse PCA ...


Sparse Regularization for High Dimensional Additive Models

We study the behavior of the l1 type of regularization for high dimensional additive models. Our results suggest remarkable similarities and differences between linear regression and additive models in high dimensional settings. In particular, our analysis indicates that, unlike in linear regression, l1 regularization does not yield optimal estimation for additive models of high dimensionality....


Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation

While several papers have investigated computationally and statistically efficient methods for learning Gaussian mixtures, precise minimax bounds for their statistical performance as well as fundamental limits in high-dimensional settings are not well-understood. In this paper, we provide precise information theoretic bounds on the clustering accuracy and sample complexity of learning a mixture...


High-dimensional classification by sparse logistic regression

We consider high-dimensional binary classification by sparse logistic regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the non-asymptotic bounds for the resulting misclassification excess risk. The bounds can be reduced under the additional low-noise condition. The proposed complexity penalty ...
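As a loose illustration of sparse logistic classification in high dimension, the sketch below fits an l1-penalized logistic regression with scikit-learn. This is not the procedure described above: the l1 penalty is used as a convex stand-in for a complexity penalty on the model size, and all dimensions and parameter values are arbitrary choices for the example.

# Minimal sketch: l1-penalized logistic regression on synthetic k-sparse data.
# The l1 penalty is a stand-in; the paper's procedure penalizes model size directly.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p, k = 200, 1000, 5                      # n samples, p features, k active features
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 2.0                              # k-sparse true coefficient vector
probs = 1.0 / (1.0 + np.exp(-X @ beta))     # logistic link
y = (rng.random(n) < probs).astype(int)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)
print("selected features:", np.flatnonzero(clf.coef_[0]))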




Publication year: 2012